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Abstract 

A framework for causal inference from two-level factorial designs is proposed. The framework 
utilizes the concept of potential outcomes that lies at the center stage of causal inference and ex- 
tends Neyman's repeated sampling approach for estimation of causal effects and randomization 
tests based on Fisher's sharp null hypothesis to the case of 2-level factorial experiments. The 
framework allows for statistical inference from a finite population, permits definition and esti- 
mation of estimands other than "average factorial effects" and leads to more flexible inference 
procedures than those based on ordinary least squares estimation from a linear model. 
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1 Introduction 

Factorial designs were originally developed in the context of agricultural experiments (Yates 1937, 
Fisher 1942). These designs allow the relative effects of several factors and their interactions to 
be assessed efficiently. A 1 K factorial design involves 2 K treatment combinations arising from 
K factors, each at two levels. The estimands, or objects of interest, in 2 K factorial designs are 
typically the 2 K — 1 factorial effects, which are essentially causal effects of the experimental factors 



on the response of interest (say 1"), and include K main effects and (^") j-factor interactions for 
j = 2,...,K. 

It is surprising that the estimands have typically received less attention in factorial design 
literature compared to their estimators. In several textbooks on experimental design (e.g. Wu 
and Hamada 2009, Ch. 4), estimated factorial effects are first defined as orthogonal contrasts of 
the observed values of the response. The estimands are then defined as the expectation of these 
estimators over a hypothetical super population of interest, from which the experimental units are 
assumed to be randomly sampled. More precisely, the objects of inference are described in terms 
of the parameters of a regression model of the observed response on the design matrix with an 
additive error. The inference for factorial effects is, therefore, typically based on a linear model 
or a generalized linear model (GLM) framework depending on whether the observed response is 
normally or non-normally distributed. 

The linear model based framework for statistical analysis of factorial experiments, although 
used in most factorial design applications (see, for example, Box Hunter and Hunter 2005, Wu and 
Hamada 2009), suffers from the following inherent drawbacks: 

1. It defines the causal estimands as a parameter of the probability distribution of the observed 
response, whereas an estimand should ideally be based on the scientific goals of the experi- 
ment. 

2. It does not relate the estimands to the population of interest, and more specifically, to the 
experimental units that constitute the population. For example, it fails to distinguish between 
situations where the population of interest is a hypothetical super population and infinite 
(e.g., generated by a manufacturing process) and finite (e.g., a study that pertains to a well- 
defined set of schools in New York) . Whereas the model may make sense under sampling of 
experimental units from a super population, it is more difficult to interpret the parameters 
as objects of interest for a finite population. 

3. Because the linear model does not include a treatment assignment variable, it needs to be 
re-defined under every distinct randomization scheme (e.g., typically random effects are in- 
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troduced for split-plot models (Wu and Hamada 2009)). 

4. It does not reveal how the estimation will be affected if additivity does not hold, i.e., if the 
effects of treatment combinations vary across experimental units. 

A natural question that arises at this point is: why have recent developments in the field of fac- 
torial experiments ignored the limitations stated above? A plausible explanation emerges from the 
fact that most of the recent theoretical developments have been triggered by industrial applications, 
and the linear-model-based framework works well in most typical industrial experiments. First, the 
experimental units are often exchangeable due to controlled experimental conditions, making the 
assumption of constant treatment effects a realistic one. Second, the population of interest is typ- 
ically infinite and hypothetical, because the experimenter's interest lies in the future selection of 
optimal conditions identified through experimentation. Third, the average treatment effect over the 
hypothetical population of interest is useful and easily interpretable in most industrial applications. 

Unfortunately, due to the above limitations, factorial designs, although being widely used for 
industrial experimentation, have found few applications in social, behavioral and biomedical exper- 
iments, where most of the aforementioned complexities are predominant. Modern-day technology is 
also making industrial experimentation increasingly complex. Several industrial experiments now 
have complex randomization structures. Large inherent variation among experimental units, e.g., 
substrates in nanotechnology, are reportedly leading to poor reproducibility of experimental results 
(Dasgupta et al. 2008). Considering the potential to apply factorial designs in social, medical and 
behavioral experiments (Chakraborty et al. 2009, Collins et al. 2009) and their ever-increasing 
need in engineering and industrial experimentation, a unified framework addressing the limitations 
of the linear-model based framework for inference appears highly desirable. 

In this article, we propose a framework for causal inference from 2 K factorial design to address 
these deficiencies. The framework utilizes the concept of potential outcomes that lies at the center 
stage of causal inference (Rubin 2008). Although such a framework for single- factor experiments 
with two levels is well-developed and popularly known as the Rubin Causal Model (RCM, Holland 
1986), it is not yet developed for multiple-factor experiments. We consider 2 K designs because they 
can be studied as natural extensions of treatment-control studies with a single factor. 
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The article is organized as follows. In Section 2 we briefly review the RCM for randomized 
experiments and observational studies involving a single factor at two levels and discuss it from 
a historical perspective. In Section 3 we develop the basic framework for extending the RCM to 
2 K designs by defining factorial effects in terms of the potential outcomes, and propose a response 
model that is random only through the assignment mechanism. In Sections 4 and 5 respectively, 
we extend Neymanian (1923) and Fisherian (1925) randomization-based inference to 2 K designs. 
In Section 6 we develop a more flexible Bayesian approach for causal inference from 2 K designs. 
In Section 7, we demonstrate the advantages of the approach with a practical example. Some 
concluding remarks are given in Section 8. 

2 A brief review of potential outcomes, RCM and its evolution 

The first formal notation for potential outcomes was introduced by Neyman (1923) for randomization- 
based inference in randomized experiments, and subsequently used by Kempthorne (1952) and Cox 
(1958) for causal inference from randomized experiments. The concept was formalized and extended 
by Rubin (1974, 1975, 1977, 1978) for other forms of causal inference from randomized experiments 
and observational studies, and exposition of this transition appears in Rubin (2010). 

The early evolution of the RCM was motivated by the need for a clear separation between the 
object of inference - often referred to as the science - and what researchers do to learn about the 
science (e.g., randomly assign treatments to units). In the context of causal inference, the science is 
a matrix where the rows represent N units, which are physical objects at a particular point of time, 
and the columns represent potential outcomes under each possible exposure. Thus, for a study 
with one factor at two levels represented by 1 and 0, each row of the science can be written as 
[Yi(l), ^i(O)], where Yi(x) is the potential outcome of unit i if unit % receives treatment x, x € {0, 1}, 
indicated by Wi(x) = 1. The representation of the science is adequate under Stable unit treatment 
value assumption (SUTVA) as defined by Rubin (1980). 

The causal effect of Treatment 1 versus Treatment for the ith unit is the comparison of the 
corresponding potential outcomes for that unit: 1^(1) versus 1^(0) (e.g., their difference or their 
ratio). The "fundamental problem facing inference for causal effects" (Rubin 1978) is that only 
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one of the potential outcomes can ever be observed for each unit. Therefore, unit-level causal 
effects cannot be known and must be inferred. The RCM permits prediction of unit-level causal 
effects from either the Neymanian (1923/1990) perspective or from the Bayesian perspective (Rubin 
1978), although such estimations are generally imprecise relative to the estimation of population 
or subpopulation causal effects. 

Although the average causal effects Y(l) — Y(0) are common estimands in many fields of ap- 
plication, other summaries of unit-level causal effects may also be of interest in specific scientific 
studies. As has been emphasized by Rubin (1974-2010), there is no reason to focus solely on av- 
erage causal effects, although this quantity is especially easy to estimate unbiasedly with standard 
statistical tools in randomized experiments under simple assumptions. 

Rubin (2010) describes the RCM in terms of three legs - the first being to define causal effects 
by using potential outcomes (define the science), the second being to describe the process by which 
some potential outcomes will be revealed (the assignment mechanism), and the third being the 
Bayesian posterior predictive distribution of the missing potential outcomes. 



3 Extending the RCM to 2 K factorial designs 

3.1 Defining factorial effects in the potential outcomes framework for a finite 
population 

Let the K factors be indexed by k, and let z denote a particular treatment combination represented 
by a K-dimensional vector of -l's and l's, where the fcth element indicates whether the kth factor 
is at its low level (-1) or at its high level (+1), k = 1, . . . ,K. The number of possible values of 
z is J = 2 K ; and we denote the set of all J treatment combinations by Z. Let Yi(z) denote the 
potential outcome of the ith unit if exposed to treatment z. Note that when introducing this 
notation, we implicitly make the stable unit treatment value assumption (SUTVA), which means 
that the potential outcome of a particular unit depends only on the treatment combination it is 
assigned, and NOT on the assignments of the remaining units, and that there are no hidden versions 
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of treatments not represented by the J values of z. All potential outcomes for unit i comprise the 
vector Y ; L of dimension J. Finally, we define the science as the N x 2 K matrix Y of potential 
outcomes in which the ith row is the J- vector Y \, i = 1, . . . , N, and assume that N = r2 K , where 
r is an integer representing the number of replications of each treatment combination. 

We are interested in contrasting, for each unit, one half of the potential outcomes with the other 
half of the potential outcomes. For example, the difference of the averages of the potential outcomes 
when factor 1 is at its high level and at its low level, the so-called "main effect of factor 1". Of 
course, other definitions could be the difference in medians, or the difference in the logarithm of 
the averages, but the tradition in the study of factorial experiments is to deal with the difference of 
averages, and this is the situation we focus on here. A factorial effect for each unit is the difference 
in the averages between one half of the potential outcomes and the other half. Therefore a factorial 
effect for unit i can be defined by a vector of dimension J, say g, with one half of its elements 
equal to -1, and one half equal to +1. Dividing the inner product of vector g with vector Y \ by 

2 K-1 

we obtain the g-factorial effect for unit i. There are J — 1 such vectors g, which we index 
by j = 1, . . . , J — 1 and order as follows. The first, g 1: generates the main effect of factor 1; g 2 
generates the main effect of factor 2; and proceeding this way, g K generates the main effect of 
factor K. The definition of the main effect of factor j for unit i, in terms of potential outcomes is 
therefore nj = 2^-^ Y \, i = 1, . . . , N; j = 1, . . . , K. 

After the main effects, the next (^f) vectors gj generate the (^f) two-factor interactions between 
factors for unit i. The first generates the interaction between factor 1 and factor 2, the second 
generates the interaction between factor 1 and factor 3, etc., in the analogous order, moving on to 
the two-way interaction between factor K— 1 and factor K. The definition of the two-way interaction 
between factor k and factor k' for unit i is therefore r^- = 2~^ K ~ l "> g'-Y i, i = 1,...,N; j = 

K + 1 , K+ (^) , where each element of gj is obtained by multiplying the corresponding elements 

of the g-vectors representing the main effects of factors k and k' . 

The next (^) values of gj generate the (^) three-way interactions for unit i, first between factor 
1, factor 2, and factor 3; followed by between factor 1, factor 2, and factor 4; etc. Each g-vector 
representing an interaction between three factors is generated by multiplying the three g-vectors 
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representing the main effects of these factors element- wise. Proceeding in a similar manner, we 
finally end with the i^-way interaction between all of the of factors, generated by a vector labeled 
as g j i- For completeness and later use, we label the vector generating the ith unit's mean potential 
outcome as g = (1, . . . , 1), so that we have J values of gj, j = 0, . . . , J — 1. 

A general factorial effect for unit i defined by vector g^ will be denoted by r^, which is simply 
a linear combination of unit i'a potential outcomes and can be expressed as 

nj = 2-^- 1 ) f ,;.y i , % = i, . . . , n, j = i, . . . , j - 1. (i) 

The average of these N unit-level factorial effects across the iV units for each possible gj define 
the J — 1 factorial effect estimands in the finite population: 

^ = ^E^ = 2-^-V i V, (2) 

where 

1 - 

* = nT, Y *' (3) 

i=i 

The factorial effects fj defined by ([2]) are the finite-population estimands in the statistical 
inference problem. Note that apart from defining the average factorial effect fj, the potential 
outcomes model and the definition of unit-level factorial effects permit defining the median of 
unit-level factorial effects or the quantiles of the distribution of the unit-level factorial effects as 
population-level factorial effects. Such population-level effects may be more meaningful estimands 
in certain scientific queries compared to the conventional average factorial effects. 

Example: Consider a 2 2 factorial experiment, where the potential outcomes Y~i for the ith 
unit are Yi(— 1, — 1, — 1). The main effects of factors 1 and 2, respectively, 

are represented by vectors g x = (—1,-1,1,1)', #2 = ( — 1, 1, — 1, 1)', and the two-way interaction 
between factors 1 and 2 is represented by the vector gr 3 = (1, — 1, 1, — 1)'. The three unit-level 
factorial effects t^, t$2 and r%z are given by 2~ 1 g' 1 Yi, 2~ 1 g' 2 Yi and 2~ 1 g' 3 Yi, and the corresponding 
population-level factorial effects f.i,f.2 and f.3 are 2~ 1 g' 1 Y, 2~ 1 g' 2 Y and 2~ 1 g' 3 Y, where Y is given 
by ©. 
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The definitions of unit-level and finite population-level factorial effects can easily be extended 
to define factorial effects for a hypothetical super population (Imbens and Rubin 201 f, Ch. 6) by 
letting the population size N grow infinitely large, keeping the number of factors K fixed. Super 
population-level factorial effects are defined as 



1 N 

f^ p = lim Tj = lim — > Tj,-, (4) 



1=1 

where Tjj and r j are defined by (pQ) and (J2J) respectively. 

3.2 Assignment mechanism, observed outcomes and unbiased estimation of 
treatment means 

As observed repeatedly by Rubin (2005, 2008), statistical inference for causal effects requires the 
specification of an assignment mechanism; a probabilistic model for how experimental units are al- 
located to the treatment combinations. For 2 K factorial designs, we define the following assignment 
variables: 

1 if the ith. unit is assigned to z 
otherwise. 

Clearly, there are 2 k N such assignment variables satisfying the following conditions: 



Wi(z) 



J^Wiiz) = 1, i = l,...,N, (5) 
z 

N 

^Wi{z) = r, for all z. (6) 

i=i 

Condition ([S]) follows from the fact that every experimental unit can be allocated to one and only 
one treatment combination, whereas ([6]) specifies the number of replications (r) for each treat- 
ment combination. Depending on the treatment assignment mechanism, Wi(z) can have differ- 
ent probability distributions. We assume a completely randomized treatment assignment so that 
E(Wi(x)\Y) = l/2 K for all z, where E{-\Y) is the expectation over all possible randomizations. 

Denote the observed outcome corresponding to the ith experimental unit by Yj° bs , i = 1, . . . , N, 
so that 

Y ^ = Y^W l {z)Y i {z). (7) 



A natural estimator of Y(z), the population mean for treatment combination z, is 

1 1 N 

Y° hs (z) = - Y t ohs = -Y,W l {z)Y i {z), (8) 

i:Wi(Z)=l i=l 

which is an unbiased estimator for Y(z) because E(Wi{z)\Y) = l/2 k = r/N. The following three 
lemmas are used to derive important sampling properties of Y ohs (z), and consequently, those of 
the estimated factorial factorial effects, defined in Section [H The first two lemmas concern the 
assignment variable Wi(z). Define 



Di(z) = Wi{z) - -, 



(9) 



where E(Di{z)\Y) = 0. Then, 



Lemma 3.1. For i,i' = 1, . . . , N and any treatment combination z, 



E(Di(z),D if (z)\Y) = { 



r(N-r) ., • _ ., 

— jp — y 1 — 1 

-r(N-r) ., • , ., 

k N 2 {N-i) 'J 'r 1 



Lemma 3.2. For i,i' = 1,...,N and any two distinct treatment combinations z and z* , i.e., 



z^z* 



E(D i (z),D i ,(z*)\Y) = < 



— r 



if i = i' 
\ N 2 (N-1) */ * ^ * 



Lemma 3.3. Let 



N 



S 2 (z) = Y J (Yi(z)-Y(z)f/(N-l), 



(10) 



8=1 



denote the variance of all potential outcomes (with divisor N — 1) for treatment combination z, and 



N 



S 2 (z, z*) = - Y(z))(Y(z*) - Y(z*))/(N - 1), 



(11) 



i=l 



denote the covariance (with divisor N — 1) among pairs of potential outcomes for two distinct 
treatment combinations z and z* . Then, S 2 (z) and S 2 (z,z*) can be expressed in the following 



alternative forms: 

N N N 



* 2 (*) = ^[E^-^E E 

i=l i=l i'=X,i'^i 

iV Af 



i=l i=l i'=l,i'=£i 



By application of Lemmas 13.11 - 13.31 it follows that 

yy" I- 

Var(Y obs (*)|y) = — ^^ 2 (z), for all z, (12) 
Cov(Y obs (z),Y obs (z)*|F) = -1^(2, z*) for all z + z* , (13) 

where S (z) and S 2 (z,z*) are defined in Lemma 13.31 

We conclude this section by observing that an unbiased estimator of any population-level average 
factorial effect fj given by ([2]) can be obtained by replacing the treatment mean Y(z) by Y ohs (z). 
We will now formally define this estimator and study its repeated sampling properties. 



4 Neymanian randomization-based inference for 2 factorial de- 
signs 

As observed by Rubin (2008), "Neyman's form of randomization-based inference can be viewed 
as drawing inferences by evaluating the expectations of statistics over the distribution induced by 
the assignment mechanism in order to calculate a confidence interval for the typical causal effect." 
Here we study the sampling properties of the estimated factorial effects f.j defined as 

f- = 2-(K^)g' jY 0bS : (14) 

where Y ohs i s the J-vector of average observed outcomes, and investigate the effect of additivity 
on the sampling properties under the Neymanian framework. Now, we state two theorems (proofs 
in the Appendix) that summarize the sampling properties of estimated average factorial effects. 
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Theorem 4.1. For a completely randomized treatment assignment mechanism, the estimator f.j 
given by (14\ ) is an unbiased estimator of the factorial effect f.j, that is, averaging over the ran- 
domization distribution of the treatment assignment variables. 

Theorem 4.2. For a completely randomized treatment assignment mechanism, the sampling vari- 
ance of f.j is 

2 2(K-l) r Y1 S2 ( Z ) ~ Jf S P ( 15 ) 

where S 2 (z) is given by [W\) , and 

1 N 

5 f = ^tEn-^) 2 ' ( 16 ) 

denotes the population variance of all unit level factorial effects Tij, i = 1, . . . , N. 



Although Theorem 14.11 is intuitive and straightforward, Theorem 14.21 extends Neyman's (1923) 
results to 2 K factorial designs and provides us with a way to assess the effect of additivity on the 
sampling variance of an estimated factorial effect. Clearly, the term in the right hand side of 
(|15p cannot be estimated from the experimental data because none of unit-level factorial effects 
are observable. Let 

= < 17 > 

be the finite population correlation coefficient among all potential outcomes for treatment combi- 
nations z and z* , where S 2 (z) and S 2 (z,z*) are given by (jlOp and (jlip respectively. For the jth 
factorial effect, define the subsets Z^,ZJ C Z by 

Z+ = {z € Z : jth element of z = +1}, Zj = {z £ Z : jth element of z = -1} . (18) 

Thus 7L^ and ZJ partition the set Z into two disjoint sets of treatment combinations. It can be 
shown after some straightforward algebraic manipulations that 



S] = ^iy[£s 2 (2) + 2[ £ £ P (z,z*)S(z)S(z*) 

+ E E p(*,z*)S(z)S(z*)+ £ E p(z,z*)S(z)S(z*)]]. (19) 



zezj z*eZ" zezj z*& 
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Recall that two distinct treatment combinations z and z* are said to have additive effects if 
the difference of the unit-level potential outcomes Yi{z) — Yi(z*) is the same for each i = 1, . . . , N. 
Then a set of necessary and sufficient conditions for "strict" additivity (i.e., additivity of effects for 
all treatment combinations) are: (i) S 2 (z) = S 2 and (ii) p(z,z*) = 1 for all z,z*. It immediately 
follows from (fl~9j) that a necessary and sufficient condition for strict additivity is: S 2 = 0. Thus, 
substituting S 2 (z) = S 2 and S 2 = into ^E), we arrive at the following corollary 

Corollary 4.2.1. Under strict additivity of treatment effects, for any factorial effect f,j, the sam- 
pling variance off j is 

VarfolY) = ^S 2 , (20) 

where S 2 is the common population variance of potential outcomes for each treatment combination 
z. 

It is well known (e.g., Wu and Hamada 2009 Chapter 4) that under the assumption of a linear 
model with additive independent and identically distributed errors with common variance a 2 , the 
sampling variance of estimators of super population-level estimands fj is (4/n)<r 2 , where n is a 
finite sample of units chosen randomly for the experiment from the super population. Here we 
show that an analogous result holds for estimators of finite population-level estimands under the 
strong assumption of strict additivity. 

Next, to explore the effect of non-additivity on sampling variances, we consider a special case 
where S 2 (z) = S 2 and p{z,z*) = p{< 1) for all z,z*, which is known as Compound Symmetry of 
the variance-covariance matrix. Substituting the two conditions into (|19p . we have 

S] = ^a-p)S 2 , (21) 
and consequently obtain the following corollary: 

Corollary 4.2.2. If non-additivity of treatment effects is characterized by compound symmetry of 
the potential outcomes, then the sampling variance of fj is 

Var(?,|r) = l(l-l^)S 2 . (22) 
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Under the assumption of compound symmetry of the potential outcomes, to ensure positive 
definiteness of the 2 K x 2 K matrix of population variances, S 2 (z) and population covariances 
S 2 (z,z*), we must have p > —1/(2 K — 1). Therefore, from (j20|) and (|22p . a third corollary on the 
bounds of Var(fj) can immediately be established: 

Corollary 4.2.3. Under compound symmetry, the sampling variance of each estimated factorial 
effect satisfies the following bounds: 

^(1 " ^TI^ 2 < Var(?,|F) < ^S 2 . (23) 



Two interesting observations follow from Corollaries 14.2.21 and HT2T31 First, Var(f.j) is largest 
under strict additivity, as observed by Imbens and Rubin (2012, Ch. 6) for a single-factor experi- 
ment. Second, under compound symmetry, the effect of non-additivity gets smaller as the number 
of factors K increases. 

As an illustration of how the above results can help understand the effect of lack of additivity 
on the sampling variance of estimated factorial effects, consider the case of a 2 2 design with the four 
treatment combinations (-1,-1), (-1,1), (1,-1) an d (1,1)- Then under strict additivity, the estimators 
of the causal effects r.i,r.2 and r.3 have the same sampling variance S 2 /r, where r = A/4 denotes 
the number of replications. Under the assumption of compound symmetry, the estimators have the 
same variance: (3 + p)S 2 /(Ar). 

The assumption of a common correlation among potential outcomes of all pairs of treatment 
combinations is strong. A more realistic potential outcomes model may be one that assumes 
different correlations associated with different factors. Assume that, in a 2 2 experiment, the po- 
tential outcomes are correlated only if they correspond to a fixed level of factor 1, and are un- 
corrected otherwise. This means, out of the six possible pairs of potential outcomes, only two - 
(y(— 1, — 1), Y(— 1, 1)) and (Y(l, — 1), Y(l, 1)) - have a non-zero correlation coefficient p, and the 
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matrix R of all correlation coefficients p(z,z*) has the following structure: 

' 1 p ^ 

p 1 

R= . (24) 

1 p 

V o o P 1 / 

Then, by successive application of (|19p and Theorem l4.2l it follows that the sampling variance of f.i 
equals (3+p)S 2 / (4r), whereas the sampling variances of f .2 and f .3 are both equal to (3 — p)S 2 / (4r). 
This example makes it clear that the sampling variances of estimated factorial effects can be severely 
affected by lack of additivity, and may be different depending on the structure of the correlation 
matrix R. Unfortunately, without additional information or assumptions, there is no way one can 
estimate the correlation matrix from data. 

So far we have considered the sampling distribution of estimated factorial effects individually. 
For factorial designs, it is also important to study the joint distribution of all the estimated factorial 
effects. The following result gives an expression for the sampling covariance between two estimated 
factorial effects for a finite population. 

Theorem 4.3. Let f j and f.ji denote two distinct factorial effects. For a completely randomized 
treatment assignment mechanism, the covariance between the estimated factorial effects f.j and f.ji 
is 



1 



£ S\z)- £ S 2 (z)- £ S 2 (z) + £ S\z) 



ZGZ7HZ+ 



2 2(K-l) r 

where S 2 (z) is given by i fTOj) . and 



zez+nzj, 



zez+nzj; 



l s 2 



N 



(25) 



denotes the finite population covariance between all unit level factorial effects Tij and t^i, i = 
1,...,N. 

Note that Cov(r.j, f „•/) has a form that is very similar to the expression for Var(r.j) obtained in 
Theorem 14.21 in the sense that it involves the term S 2 j, which is non-estimable like S 2 in Theorem 
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B~2l Noting that under strict additivity, S 2 (z) = S 2 for all z G Z and S 2 -, = for all j / /, we 
have the following corollary: 

Corollary 4.3.1. For all j ^ f , estimated factorial effects f j and f ji are uncorrelated under 
strict additivity. 

To explore the effect of non-additivity on Cov(f .j, f ,,•/), we again consider the case of compound 
symmetry. The following corollary can be established after some algebraic manipulations (proof in 
Appendix). 

Corollary 4.3.2. For all j ^ j' , estimated factorial effects f_j and f j/ are uncorrelated under 
compound symmetry. 

Estimation of variance and covariance and inference on factorial effects 

The Neymanian inference procedure depends on the estimation of the factorial effects and their 
sampling variances and covariances. To estimate the sampling variances and covariances, we first 
seek an unbiased estimator of S 2 (z). Clearly, the sample variance for the treatment combination 
z, given by 

s 2 (z) = -±- _ Y° b *(z)) 2 , (26) 

i 

is an unbiased estimator of S 2 (z). Inserting s 2 {z) in place of S 2 (z), the Neymanian estimator of 
Var(r. j) can thus be obtained from Theorem 14. 2 1 as: 

Var Ncy (?.,|lO = —^-Y. s2 ( z )- ( 27 ) 

z 

Because S 2 is not observable and estimable without assumptions, \ai^ ey {f .j) in expectation is an 
upper bound on the true sampling variance of f,j. The bias S 2 /N depends on the correlation matrix 
R of potential outcomes defined in (|24p . However, from Theorem 3, it follows that the estimator 



Cov Ney (?. j ,?. J v|F) = -^J- iy -[ ]T s\z)- s2 W- E s2 W+ E ^W]-(28) 

can overestimate or underestimate Cov(f f j/\Y) depending on whether S 2 -, is positive or nega- 
tive. 
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Let r denote the vector of the J — 1 population-level factorial effects. To draw inference about 
r, consider the statistic: 

T N = r't^f, (29) 

where r is the Neymanian estimator of r; denotes the true covariance matrix of t, and £7. 
its estimator. The diagonal elements of £7. can be obtained from (|27p . and the off-diagonals from 
(|28f) . Note that T,j- will be a diagonal matrix under the assumption of equal variance of potential 
outcomes for all treatment combinations. For < a < 1, a 100(1 — a)% (Neymanian) confidence 
region for t is: 

{+ :pi-«/2<T N < Pa/2 }, (30) 

where p a denotes the upper-a point of the asymptotic distribution of T N . 

From Corollary |4.2.1[ it follows that under the assumption of strict additivity, T N = (N/4)tt/S 2 , 
where S 2 is the estimator of the common variance S 2 . Substituting S 2 = J2z s 2 (z)/2 K , it is easy 
to see that, under strict additivity, T N is the ratio of the treatment mean sum of squares (MSTr) 
and residual mean sum of squares (MSR), as defined in the standard analysis of variance frame- 
work (e.g., Montgomery 2000). Following the work of Wilk (1955), it can be argued that if strict 
additivity holds, the randomization distribution of T N can be approximated by an F distribution 
with J — 1, N — J degrees of freedom. Under sampling from a normal super population when the 
inference is about the J — 1-vector of super-population estimands r instead of r, this result is 
exact and much more straightforward to prove. 

Also, under a normal approximation (known to be valid under strict additivity), 100(1 — a)% 
(Neymanian) confidence intervals for f j can be obtained as 

T. j ±Z a/2 S.E.(T. j ), (31) 

where S.E.(f.j) = \Jv&VN C y(T.j) denotes the estimated standard error of f.j, which can be obtained 
from ()27p . and z a denotes the upper a point of the standard normal distribution. 

Neymanian inference on individual or a collection of factorial effects under the proposed frame- 
work is essentially the same as the linear-model-based inference that has been widely used by 
experimenters and analysts for decades (see, for example, Wu and Hamada 2009; Box, Hunter and 
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Hunter 2005). However, as noted by Imbens and Rubin (2012, Ch. 6) for K = 1, it is imperative 
that under lack of additivity, the asymptotic confidence intervals given by (|30p and (|31|) are viewed 
as conservative in the sense that they are wider than the "true" intervals that involve the actual 
standard errors, which are overestimated in the Neymanian framework. Knowledge of the correla- 
tion matrix R, defined in (|24p . can shorten the confidence intervals by an amount that depends on 
the structure of R and its entries. To illustrate this aspect more formally using the example of the 
2 2 experiment discussed earlier, consider the following four correlation structures: 
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Note that in order to be positive definite, R\ must satisfy — 1/3 < p < 1 and R% must satisfy 
\pi + P2I < 1- For compound symmetry, R2 is positive definite for all — 1 < p < 1. Assume 
that in all the cases the potential outcomes have the same variance S 2 for all four treatment 
combinations. Table [T] shows the 95% confidence intervals (without any corrections for multiple 
testing) for the three factorial effects t.i,t.2 an d r.3. Lack of knowledge of R leads to the same 
confidence interval ±1.96 x 2-^= for each factorial effect on every occasion, whereas better inference 
(tighter confidence intervals) is possible if such knowledge can be incorporated by conditioning on 
covariates and replacing partial correlations for correlations. 



5 Fisherian randomization-based inference for 2 factorial designs 

Randomization tests are useful tools because they assess statistical significance of treatment effects 
from randomized experiments without making any assumption whatsoever about the distribution 
of the test statistic. Such tests can be used to test Fisher's sharp null hypothesis (see Rubin (1980)) 
of no factorial effect at unit levels, which is a much stronger hypothesis than the traditional one 
of no average factorial effects. Randomization tests have rarely been studied in the context of 
factorial experiments, except for the work by Loughin and Noble (1997), who studied such tests in 
the framework of a linear regression model for the observed response with additive error (in other 
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Table 1: Effect of lack of additivity of inference of factorial effects for a 2 2 design 



Correlation 
structure 


95% CI for ff 
perfect knowledge of R 


ictorial effects under 
assumption of strict additivity (R = R\ ) 


Ri 


±1.96 x 2-i= for all effects 

VN 


±1.96 x 2^= for all effects 

VN 


R 2 


±1.96 x V3 + p^= for all effects 


±1.96 x 2-^= for all effects 

VN 


R 3 


±1.96 x ^3 - for d 1 
±1.96 x V3±P^= for #2,012 


±1.96 x 2-f= for all effects 

VN 


R4 


±1.96 x ^3-( Pl -p 2 )^for U 
±lMx y /3-(p 2 -p 1 )j- for 2 , 
±1.96 x y/3 - (pi + p 2 )^= for 0i2 


±1.96 x 2^= for all effects 

VN 



words, invoking the assumption of strict additivity). In the following paragraphs, based on our 
proposed framework, we present a mathematical formulation for randomization-based inference 
procedures for a randomized 2 K factorial experiment. In particular, we aim at extending the 
procedure described by Imbens and Rubin (2012) to obtain "Fisherian Fiducial" intervals (Fisher 
1930; Wang 2000) for the J - 1 factorial effects f.j. 

To obtain Fisherian intervals, we first obtain estimates f.,- of the factorial effects from the 
observed data using (|14p . The next task is to obtain ranges of plausible values for each factorial 
effect under the stated randomization scheme, using the above estimates. This is done by (i) 
considering a sequence of sharp null hypotheses on the factorial effects, (ii) imputing the missing 
potential outcomes under each sharp null, (iii) computing p-values for each factorial effect using 
their randomization distributions and (iv) identifying ranges of values for the factorial effects for 
which the p values are not extreme. 

Let Tj = (Tji, . . . , Ti j—i)' denote the (J — l)-vector of unit-level factorial effects for unit i. 
Define the sharp null hypothesis 

H$:Ti = ri V i = l,...,N, 
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where rj = (771, . . . , Tfj—i)' is a vector of constants. Let G denote the J x J matrix whose columns 
are the vectors g , . . . ,f7j_i- Then from (pQ) it immediately follows that 

/ 

Tin 1 

(32) 



G' 



no 



where r^o = 2 K g' Yi denotes the mean of Y i. 

Recall that for the ith experimental unit, the experimenter observes only one potential outcome 
Y° hs . Let Yf ls denote the (J - l)-vector of the missing potential outcomes. Note that the rows of 
the J x K submatrix formed by columns 2 to K + 1 of G represents the J treatment combinations. 
Denote by g° hs the row of G that contains the treatment combination z assigned to unit i, and let 
the submatrix formed by the remaining J — 1 rows be Gf 1 ™. Then from (j32j) we can write 



\^obs 



1 i 




(33) 



Assuming that is true, imputation of the vector of missing potential outcomes 1^™ 1S requires 
two simple steps: 

1. Estimate t^o as fjo = Yj° bs — (l/2)g° bs/ 77, where g° hs is the vector g° bs without its first element 
(which is unity). 

2. Impute the missing potential outcomes for the ith unit using 

/ 



Y 1 - 




Having generated the complete set of potential outcomes for all ./V experimental units, we now 
compute the estimated factorial effects fj,j = 1, . . . , J — 1, for each possible assignment of iV ex- 
perimental units to the J treatment combinations, generating a randomization distribution for each 
effect. In practice, owing to the prohibitive number of possible arrangements, typically a random 
sample of the set of all possible assignments is obtained. From the randomization distribution, a 
p- value for each factorial effect is computed by adding the probabilities of all assignments that lead 
to a value as or more extreme than the value observed for the estimate of that factorial effect. 
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Let p(rjj) denote the p- value for the factorial effect T.j, j = 1, . . . , J — 1 under the sharp null 
hypothesis H 1 . The function p(r]j) is monotonic in r\j. Considering different sharp null hypotheses 



by varying rj, it is possible to obtain rjf and satisfying 







sup{r]j : p(i]j) < a/2}, 

m 

mfi-qj : p(r)j) > 1 - a/2}, 



(34) 
(35) 



for < a < 1. Then, [CfXj] represents 100(1 — a)% "Fisherian" interval for the factorial effect 
f -r 

We now illustrate the proposed approach using a simulated 2 2 experiment with N = 20 experi- 
mental units. The four potential outcomes for each unit are generated using a multivariate normal 
model with mean vector (10, 12, 13, 15)' and covariance matrix 

/ 1 .5 .5 .5 \ 
.5 1 .5 .5 
.5 .5 1 .5 
\ .5 .5 .5 1 J 

The true values of the three estimands are fx = 3, f.2 = 2 and r.3 = 0. One set of observed 
outcomes obtained using a completely randomized treatment assignment is shown in Table [2j From 
the observed outcomes, point estimates of the factorial effects are obtained as r.i = 2.98, f .2 = 1.74, 
and f .3 = 0.36. 

We formulate 100 different sharp null hypotheses of the form n = rj by choosing 100 different 
vectors 77 uniformly from the region = [— 6,6] 3 . For each of these sharp null hypotheses, the 
missing potential outcomes in Table [2] are imputed using the methodology described earlier in this 
Section. Then, a randomization distribution for each factorial effect is obtained by computing 
the effect from each of 1000 random draws from the set of possible randomizations. Figure [1] 
shows the randomization distributions of f.i, and r.2 under one of the 100 sharp null hypotheses 
T{ = (4.20,-2.22,0.81)', with the observed estimated values f \ and f.2 superimposed as vertical 
lines. The p- values for factorial effects f.i, and f.2 are obtained as 0.891 and 0.000 respectively. 

Proceeding in a similar manner, an array of 100 p-values are obtained for each of the three 
factorial effects. The plot of p(r)j) versus rjj are shown in Figure [2] for j = 1 and j = 2, i.e., for 
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Table 2: Observed outcomes for the 20 experimental units 
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Figure 1: Randomization distributions of r.i and r,i for rj = (4.20,-2.22,0.81) 

the two main effects. As described earlier and shown in the plots, the 95% fiducial limits for the 
factorial effects are obtained as the maximum and minimum values of r]j which are smaller than or 
equal to 0.025, and larger than or equal to 0.975 respectively. 

The Neymanian confidence intervals are computed using (|3ip . The 95% Fisherian and Neyma- 
nian confidence intervals for the three factorial effects are shown in Table [5j 

Table 3: 95% Neymanian and Fisherian "Fiducial' intervals for the three factorial effects 

Factorial effect True value Point estimate Neymanian 95% interval Fisherian 95% interval 
f.i 3 2.98 [1.93,4.03] [1.02,4.61] 
f.2 2 1.74 [0.69,2.78] [0.01, 3.65] 

f.3 0.36 [-0.69,1.40] [-1.38, 1.91] 



We observe that the Fisherian intervals are wider than the Neymanian intervals. However, note 
that the Fisherian intervals depend on the number of sharp-null hypotheses chosen, and also on 
the number of randomizations used in each iteration. Comparisons of Fisherian and Neymanian 
intervals need further investigation and will be reported in future work. Rubin (1984) provided 
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Figure 2: Plot of p(i]j) versus rjj for j = 1,2. 

the following Bayesian justification of the Fisherian approach to inference: it gives the posterior 
predictive distribution of the estimand of interest under a model of constant treatment effects and 
fixed units with fixed responses. Thus, a natural extension of the Fisherian approach is a Bayesian 
inference procedure described in the following section. 

6 A Bayesian framework 

The proposed Bayesian framework is based on Rubin (1978). The key idea is to develop an impu- 
tation model for the missing potential outcomes l Amis ) conditional on the observed outcomes Y oh * 
and the observed assignment vector W. Because the treatment effect is a function of (l^ obs , Y mis ), 
this plan will lead us to obtain the conditional posterior distribution of the factorial effects f.,-. To 
this end, let f(Y |G) denote a suitable probabilistic model for Yi, where is a vector of parameters 
endowed with a suitable prior distribution. For quantities a, b we use the notation [a\b] to denote 
the probability density function of a given b. Posterior inference for the factorial effects f.j entails 
the following steps: 

1. Obtain [@|"K obs , W], the posterior distribution for the parameters O conditional on the ob- 
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served data Y ohs and the observed assignment mechanism W using Bayes rule. 



2. Obtain [Y mis \Y ohs ,e,W], the conditional posterior distribution of Y mis given Y ohs , 6 and 
W. The distribution for [Y" mis |y obs , 6, W] is the posterior predictive distribution corre- 
sponding to the original model [y|6] (see Rubin(1978)). 

3. Next obtain the imputation model [y mis |y obs , VF] by marginalizing the posterior predictive 
distribution in step 2 over the parameter 0. 

4. Finally, obtain the posterior distribution for the factorial effects r,-. 

We now apply the above methodology to balanced factorial designs using a simple hierarchical 
model with conjugate prior distributions. We will also illustrate some connections between Bayesian 
inference and Neymanian inference discussed earlier in Section HI We can incorporate the linear 
model framework introduced in Section [J] into the following hierarchichal model: 



/ 



n 



p 
i 



\ 



V p 



(36) 



[n\a 2 ] ='N 2 K(ti ,a 2 /r I 2 K) , 



a 



IG(a,/3) . 



Thus the interclass correlation for the potential outcomes for each experimental unit is equal to p, 
which is considered known. Here IG denotes the inverse gamma distribution As discussed in 
Imbens and Rubin (2012), for moderate sample sizes the analysis will not be sensitive to the choice 
of priors as long as we are not too dogmatic in eliciting the prior distribution. 

The prior distribution in the hierarchical model (|36p is chosen to be conjugate and thus posterior 
inference and sampling from the posterior predictive distribution is quite straightforward. Recall 
Y ohs (x) from ((SJ). For 26Z, define the quantities 

rY ohs (z)+r of i (z) 



r + r 



1 The density of a quantity (j> distributed according to IG(a, 0) is proportional to <f> 



- a -x -4>/p 



24 



Also define the vector m 2 /f xl = (m(z)) Z £Z- The posterior distribution of the model parameters 
r, <t 2 are given by: 



a 2 



[n\Y° hs ,a 2 ,W]=N(m,-—I 2K ), 



2 V 2^r + r 

where s\z) = ^ £ i:W , (z)=1 (>? bs (z) " ? obs (*)) 2 - 

Note that (J3TJ) represents step 1 of the four steps involved in making posterior inference of f.j, 
as described earlier in this Section. After working through steps 2-4 (refer to the proof of Theorem 
16.11 in the Appendix), we arrive at the following result on the posterior distribution of the factorial 
effects: 

Theorem 6.1. Assume that the potential outcomes follow the model \36\) and fix a factorial effect 
f.j. Then, for a completely randomized assignment mechanism, 

E(f j \Y ohs ,W) = m j , (38) 
Var(f,|y^, W) = (lK( P ) + -^-(1 1 " P « 

where 



m i = i 1 ~ ~2ir) ^Td'jm + f.j , 



V denotes the posterior expectation of a 2 , given by 

V = E(a 2 \Y ohs , W) - 13 + ^ ^ Z S2(Z) + ^ m2( 



a + 2 K ~ 1 r - 1 
and k{p) = ((1 - p 2 ){2 K - 1) - 2/5(1 - p)(2 K ~ l - 1)) . 

Although the expressions for the posterior expectation and variance of f.j derived in Theorem 
16.11 appears somewhat complicated, they reduce to simple forms under specific conditions. This is 
clear from the following corollary, which also establishes the relationship between the Bayesian and 
Neymanian estimators of f.j. 
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Corollary 6.1.1. For the model IW\). 

lim E(f j \Y obs ,W)=T. j , 

TQ— >0 

lim lim Var(f AY * 3 , W) = -^(1 - -^) 

where f.j is the Neymannian estimator for the mean, defined by p4\ ), and Vi = ^ r r ^ jzJ2x s2 ( x ) ■ 

Remark 6.1. The connection between the Bayesian and Neymanian estimators of f.j, as seen 
from Corollary \6. 1. 1\ is quite intuitive. Note that ro can be thought of as a "prior sample size". 
Thus, the condition ro — > can be interpreted as lack of prior information on the mean vector of 
the potential outcomes, or alternatively, as a diffused prior for the mean vector fi. The condition 
a — > 1, /3 — >■ leads to a diffuse prior for the variance parameter a 2 . Collectively, these conditions 
reflect lack of prior information about the distribution of the potential outcomes. 

It is also worthwhile to note that, in the expression for Vi, the multiplicative term r - z ^- — > 1 
as the number of replicates r — > oo. Therefore, Vi — > s 2 as — > 1, where s 2 = 2~ K s 2 ( x ) 
is an unbiased estimator of the true common variance S 2 of all potential outcomes. Thus, as the 
number of replications r — > oo, the posterior variance of the factorial effect fj, corresponding to 
the reference prior derived in Corollary \ 6.1.1{ approaches the unbiased estimator Var^ ey {f,j\Y) 
defined by {21^ . Finally, for any finite r, Vi is smaller than its limit s 2 , which shows that the 
variance estimate for the finite population estimand is smaller than the variance estimate for the 
super population estimand. 

7 An example to demonstrate the advantages of the potential out- 
comes model in causal inference from 2 K factorial experiments 

We now demonstrate the effectiveness and versatility of the proposed framework with an example. 
Wu and Hamada (2009, Ch. 4) describes a 2 3 factorial experiment that was originally reported by 
Box and Bisgaard (1987) and conducted with the objective of reducing the percentage of cracked 
springs produced by a particular manufacturing process. Three factors - temperature of the quench- 
ing oil, carbon content (percentage) of the steel and the temperature of the steel before quenching 
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- were investigated. The response for each experimental unit (which represents a spring manufac- 
tured under a specific treatment combination z) is a binary random variable taking the value 1 if 
the spring does not crack, and zero otherwise. 

The data reported from the experiment (see Wu and Hamada 2009, Ch. 4), is actually a 
summary of the actual experimental data, and reports only the observed percentage of cracked 
springs for each of the 8 treatment combinations. Here, we assume that r = 100 springs were 
manufactured under each treatment combination in a completely randomized sequence (so that 
N = 800). Next, we simulate the potential outcomes for each of the 800 experimental units from a 
super-population model in which the potential outcome Yi{z) is drawn as a binary random variable 
taking value 1 with probability vr(z), where the probabilities, tt(z), which are given in Tabled! 
equal the observed proportion of non-cracked springs, reported in Wu and Hamada (2009, Ch. 
4). Finally we generate observed outcomes l^ obs for each experimental unit using a completely 
randomized treatment assignment. 



z 


(-1,-1,-1) 


(-1,-1,1) 


(-1,1,-1) 


(-1,1,1) 


(1,-1,-1) 


(1,-1,-1) 


(1,1,-1) 


(1,1,1) 


n(z) 


0.67 


0.79 


0.61 


0.75 


0.59 


0.90 


0.52 


0.87 



Table 4: Binary probabilities for the simulation model 

The traditional method of estimating factorial effects (Wu and Hamada 2009, Ch. 14) from 
these simulated experimental data is to fit a logistic regression model of the observed response 
on columns of the 800 x 8 design matrix (in which 0's are replaced by -l's, and columns for the 
interactions are obtained as products of the corresponding factor levels). The factorial effects are 
estimated as twice the coefficients of the fitted logistic model and are reported in Table [5j 

A natural question is, what factorial effects are the quantities in the second column of Table [5] 
estimating? Let 7r denote the vector of 7r(z)'s given by the second row of Tabled and ttl denote 
the vector of logistic transforms of tt(z)'s. Then, twice the coefficients of the logistic regression 
in the second column of Table [5] are unbiased estimates of superpopulation-level factorial effects 
(l/4W-7Ti, j = 1,...,7. However, instead of these estimands, the engineers may be interested 
in superpopulation-level factorial effects fj = {1/A)g ! --K, i.e., contrasts of vr(z), rather than their 
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Factorial effect 


Estimate 


Std. Error z 


-value 


Pr(> \z\) 


Main effect of 1 


0.1468 


0.1753 


0.84 


0.4022 


Main effect of 2 


-0.2492 


0.1753 


-1.42 


0.1551 


Main effect of 3 


1.2975 


0.1753 


7.40 


0.0000 


Interaction between 1 and 2 


0.0787 


0.1753 


0.45 


0.6534 


Interaction between 1 and 3 


0.5426 


0.1753 


3.10 


0.0020 


Interaction between 2 and 3 


-0.1060 


0.1753 


-0.61 


0.5452 


Three-way interaction 


0.1783 


0.1753 


1.02 


0.3092 


Table 5: Results 


from a loe 


fistic regression 


model 





logistic transformations. Point estimates of g'^n can be obtained as transformations of the estimated 
coefficients of the logistic regression. A more straightforward way of obtaining point estimates of 
(l/4)<7'-7r is to substitute p(z) = Y° bs (z) for 7r(z). Asymptotic standard errors of these estimators, 
given by 



We 



7T(Z)(1-7T(Z)) 



can again be estimated by substituting p(z) for tt(z). In the simulated data set, we have the 
estimated standard error as 0.0304 for each estimated factorial effect, which can be used to construct 
confidence intervals for the estimands. 

Finally, an experimenter may also be interested in finite population-level factorial effects ta = 
g'jP, j = 1, . . . , 7, which are contrasts of P(z) = X^=i Yi(z)/8Q0. Whereas such estimands may not 
be of much interest in an industrial or a manufacturing scenario, they can be meaningful if we think 
of a similar experiment in a different setting, e.g., with 800 schools in New York as experimental 
units, and the three factors as new interventions being investigated by the Department of Education 
(DOE). Note that the GLM framework and the foregoing asymptotic arguments do not permit us 
to perform inference for finite population estimands. The Bayesian framework explained in Section 
[6l however, permits us to perform inference for finite population estimands and superpopulation 
estimands separately. 

To illustrate such an analysis, we assume, for the sake of simplicity, that we are interested only 
in two factorial effects - the main effect of factor 3, represented by vector gr 3 , and the interaction 
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between factors 1 and 3, represented by vector g 5 - that appear to be significant from the earlier 
analysis. We now postulate a simple Hierarchical model. First, assume that all the potential 
outcomes for the ith, i = 1, ... ,N unit are independently distributed as binary random variables 
so that their joint distribution is given by 

f(YMz),zeZ) = J] irizf^il-TTiz)) 1 -^. 
zez 

Second, assume that given n(z) for all zeZ, the potential outcomes of all the N units are inde- 
pendently distributed, so that the joint distribution of the vector of the Nr potential outcomes is 
given by 

N 

f[Y\n(z),z E Z) = Y[f(Yi\ir(z),z€ Z). 

1=1 

Third, assume that 

ttl = ag + /3fl , 3 + 795' 

where (a, /3, 7) ~ iV(/i,S), and fi and S are assumed known. For demonstration purposes, we 
set = (1.048,0.6488,0.2713), where the first element 1.048 represents the intercept term of the 
logistic regression, and the other two are the coefficients of covariate vectors g 3 and g§ in the logistic 
regression (or half of the estimated factorial effects shown in Tabled]). Also, we choose 

^ 4 10 10 ^ 

10 100 50 
\ 10 50 100 j 

Finally, we note that, because the design is completely randomized, the distribution of W is 
independent of Y and it. 

To make super population-level inference, we simply need to sample from the posterior distri- 
bution 

/(a,/3, 7 |^° bs ,W) oc f(Y ohs \W,a,(3, 1 )f(W\a,(3, 1 )f( a ,f3, 1 ) 

cx /(a, 0, 7) Yl TTiz)^^^ YtiZ \l - 7r(z)) r-EliW 'iC z >= 1 Y,{Z) . 
z 

The superpopulation level estimands of interest, f | p and f | p , can be expressed as functions of a, /3 
and 7. After obtaining 10,000 draws from /(a, /3, 7|l^ obs , W) (each draw results in a value of the 
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superpopulation estimands) we obtain point estimates of r | and r| as their posterior means, and 
interval estimates as 95% credible intervals. 

To make finite population-level inference, we obtain the imputation model 

f(Y mis \Y ohs , W) = I f(Y mis \Y ohs ,a,P,~f,W)f(a,P,~f\Y obs ,W)da dp dj, 



where the finite-population estimands of interest T3 and r.5 can be expressed as functions of 

7, we obtain Y mis , and consequently obtain point and 
interval estimates of r.3 and f.5. The results are summarized in Table [6l 



Inference for Super Population 


Inference for Finite Population 


Estimand 


Inference using 
Regression Bayesian 


Estimand 


Inference using 
Regression Bayesian 


T .3 


0.24 [0.16, 0.32] 0.24 [0.16, 0.31] 




0.24 [0.16, 0.32] 0.24 [0.18, 0.30] 


T SP 

r .5 


0.10 [0.02, 0.17] 0.10 [0.01, 0.15] 


T.5 


0.10 [0.02, 0.17] 0.10 [0.05, 0.13] 



Table 6: Summary of inferences of factorial effects 3 and 13 for the super population and finite 
population 

The analysis described in this Section and the results in Table [6] reinforce the advantages of the 
proposed framework over the linear or generalized linear model based inference for factorial effects: 

1. It provides a clear understanding of the estimand. 

2. It allows for inference from finite populations, in addition to inference for super population. 
As seen from Table [6l the finite population credible intervals are tighter than the super 
population credible intervals. 

3. It permits making inference for estimands other than the mean of individual potential out- 
comes. In this example, suppose the experimenter is interested in the median of the unit-level 
factorial effects T%j ■ Because this estimand can be written as a function of (Y mis , Y ohs ), mak- 
ing such inference is straightforward. 
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8 Concluding Remarks 



In this article we have proposed a framework for causal inference from 2 factorial design using 
the concept of potential outcomes. Factorial effects are defined for a finite population, and the 
definitions are extended to estimands for a super population. We have discussed procedures for 
inferring these estimands under the Neymanian, Fisherian and Bayesian frameworks. Through these 
discussions and several examples, we have demonstrated how the utilization of potential outcomes 
results in better understanding of the estimands and allows greater flexibility in statistical inference 
of factorial effects, compared to the commonly used linear model based approach. 

Other than the benefits already demonstrated in this article, the proposed framework can pro- 
vide a foundation for addressing complex problems associated with the design and analysis of social, 
behavioral and medical experiments. Examples of these problems are: (i) unbalanced factorial ex- 
periments, (ii) factorial experiments with covariates, (ii) factorial experiments with randomization 
restrictions, (iii) semi-observational studies with a factorial structure (e.g., experimental units self- 
select the levels of one or more factors), and (iv) matched sampling in observational studies with a 
factorial structure. We are therefore hopeful that this article will open up a large number of research 
problems motivated by complex multi-factor experiments in social, medical and behavioral sciences 
and help experimenters in making more meaningful causal inference from such experiments. 



This research was supported by the U.S. National Science Foundation grant DMS-1107004, 
DMS-1 107070. We thank various colleagues and students for useful comments. 

Appendix 

Proof of Theorem \4-l\ We argued that Y ohs (z) defined by ([8]) is an unbiased estimator of Y(z). 
Therefore, the vector Y ohs in ()14[) is an unbiased estimator of Y defined by ©. Comparing (|14[) 
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Proof of Theorem \4.2\ Let gji denote the Ith element of vector gj, j = 0, . . . , J — 1, / = 1, . . . , J, 
and let z\ denote the Zth treatment combination, / = 1, . . . , J. From (I14p . we have 



Var(?.,) = 2-^ K ^g' 3 var{Y ohs ) Qj 

= 2- 2 ^\Y, 9% mr(y° bs (z / )) + EE 9ft 9ft' cov(Y° hs ( Zl ), F obs (^))| 
^ 1=1 x+v > 

= 2-^){j: ^^ 2 (^)-^EE 9ft 9j V S\ ZhZl ,)V 
^ 1=1 w J 



l^l 

the last step following from (|12p and f)13|) . Now, from f)16[) . we have 

AT 



%=i 

L_ 2 - 2 (^ 1) ^|^^ (yj(z/) _^ )) | 2 
i=i ^ i=i } 



N 

./ 



-2(K-1) 



E 9ftS 2 (z t ) + ^E 9ft 9 ft' S 2 (z h z v ) \. 
1=1 i+v > 



Consequently, 



Substituting (gOD into ([39]), we get 

J 



L 1 = 1 ' ■> 



the last step following from the fact that g\ = 1 for all j,l. <0> 



2 2{K 

- z, 

1 ' u - + Ift 

Proof of Theorem \4-3\ From (j!4[) , we have 

Cov(?.„?. J ,) = 2-^ K -^g' 3 var(Y ohs ) g f 

J 



(39) 



E E 9ft 9ft' S\z h z v ) = 2^S] - £ 9 2 ftS\ Zl ), (40) 
tyl' 1=1 



2 -2(K-l) g jm mr(y° bs (zO) +EE 9ft 9fl> cov(Y ohs ( Zl ), Y obs M) 

^ i=i i+v 

2- 2( ^ 1} {E S 2 (^)j, (41) 



1=1 l^V 
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the last step following from (|12p and (|13p . Now, from (|25p . we have 

JV 



4' = J^-^^Y.-g^YW^-g^Y) 

i=l 

i=l M=l ' ^ Z=l 

j E 9ji9j'iS 2 (zi) + 5j7 ffi'i' S 2 {z h z v ) f- 

^ 1=1 m' ' 



N 

-a(A--i) ; 



Consequently, 

_ J 
^ ^ 5,7 S 2 (z ; , 2,0 = 2 2 ^- 1 ),S 2 ? ., - £ 9jl9j'lS 2 (zi), ( 42 ) 



Substituting (@2D into (|4lj). we get 



*M E s\ z) - y, E * 2 (*) + E ^ 



22(K-^ 7 . 

and the proof is finished. 



zezjnzj/ zezjnzj, zez+pZT, zez+nz+ 



Proof of Equation h21\) . Each of the sets Z- and Z^~ have cardinality 2 k 1 . Thus ( 2 2 ) pahs of 
distinct treatment combinations can be formed within each of these sets. Because the total number 
of pairs of distinct combinations is ( 2 J , the number of pairs in the set Z ■ (| ZJ is ( 2 J — 2 ( 2 J. 
Consequently, from (JT9J) , we have 



r /2^ ^\ /2^ 



2/ +2 ^ 4 V 2 / V2 



5 2 , 



2 2(fc-l) 

and the result follows after simple algebraic manipulations. § 

Proof of Theorem \6.1\ Let z{i) denote the treatment combination the i th observation is assigned 
to. Then y'* ms |'j^°bs j s a multivariate Gaussian. For any z,z* / z(i), 

\Y?'(z)\Y?»,n,o*,W] =N( / x(z)+p(y, obs -,x(z(0)),cT 2 (l-p 2 )) , (43) 
Cov(Ff s (z), rf s (z*)|y, obs , /x, a 2 , W) = a 2 p(l - p) . 
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Define the quantities, 



1 



" a 2 



;i- V)^+V^ obs 



V 3 = T^Zu Hp) • 



2K 1 2 K yj 

22(JC-1) 

Using (|43|) and the definition of r^, for every j we see that is a Gaussian random variable with 
variance Vj for all i. Furthermore, conditional on /x, a 2 and Y' obs ) -f.j = -k YliLi T ij nas m ean rhj. 
Now we use the crucial fact that, and r^j are independent, conditional on n,a , Y obs for all 
i ^ i'. Thus it immediately follows that 

[f. j \Y° hs ,n,a 2 ,W}=N(m j ,±v j ). (44) 

Marginalizing over the posterior distribution of /x and a 2 , we see that the posterior of a factorial ef- 
fect f.j follows a i-distribution. Now the first claim follows from the fact that E(rj|y obs , /x, a 2 , W) = 
rhj (see (|37|) ) does not depend on a 2 . Thus by the law of iterated expectations, 



E{fj\Y obs , W) = E(rhj\Y obs , W) = mj 



Now we turn to the variance of tj. Since Var(E(r ,j\Y obs , a 2 , W)) = 0, it follows that Var(r j|l" obs , W) 

gain, using the variance formula for iterated expectations, 

Var(f. j\Y obs , a 2 , W) = E(Yai(fj\Y oh8 , /x, a 2 , W)) + Var(E(f. j\Y obs , n, a 2 , W)) 

1 g 2 , / x a 2 2 K . 1-Px2 
= N^T) k ^ + ^-Dr + J 1 " "2ir) (45) 

where for the last term we have used (|37p . Now the claim follows from (|45j) and the fact that 
E(o- 2 |y obs , W}) = V, finishing the proof. 



Proof of Corollary \4-2.S\ From Theorem 16.11 lim ro _>o ^( r . j|Y obs , W) = lim rQ ^.Q nij. 



lim m 3 = lim (l - 1^) * </ TO + ^ 



, 1- p ^ 1-p. 



T.i = TA 



2K ) ' -3 1 2 K ' ' 3 

where in the penultimate step, we used the fact that lim ro _j.o m ( z ) = Y obs (z), proving the first 
claim. For the variance, using Theorem l6 . 1 1 and straightforward algebra using the fact lim^^i^^o lim ro _>o V 
Ve yields the result. 
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